Multiply Accelerated Value Iteration for Nonsymmetric Affine Fixed Point Problems and Application to Markov Decision Processes
Authors
Abstract
We analyze a modified version of the Nesterov accelerated gradient algorithm, which applies to affine fixed point problems with non-self-adjoint matrices, such as the ones appearing in the theory of Markov decision processes with discounted or mean payoff criteria. We characterize the spectra of matrices for which this algorithm does converge with an accelerated asymptotic rate. We also introduce a $d$th-order algorithm and show that it yields a multiply accelerated rate under more demanding conditions on the spectrum. We subsequently apply these methods to develop accelerated schemes for nonlinear fixed point problems arising from Markov decision processes. This is illustrated by numerical experiments.
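As a rough illustration of the setting described in the abstract (and not the authors' exact scheme), the sketch below contrasts standard value iteration for a discounted MDP with a Nesterov-style momentum variant that applies the Bellman operator at an extrapolated point. The momentum parameter beta, its default heuristic choice, and the random test instance are assumptions made purely for demonstration; whether such an accelerated iteration converges depends on the spectrum of the underlying non-self-adjoint matrices, which is exactly what the paper characterizes.

# Illustrative sketch only, assuming a finite discounted MDP given by
# transition tensors P[a, s, s'] and rewards r[a, s]; not the paper's algorithm.
import numpy as np

def value_iteration(P, r, gamma, iters=500):
    # Standard value iteration: v_{k+1}(s) = max_a [ r(a,s) + gamma * sum_{s'} P(a,s,s') v_k(s') ].
    n_states = P.shape[1]
    v = np.zeros(n_states)
    for _ in range(iters):
        q = r + gamma * np.einsum('aij,j->ai', P, v)  # Q-values, shape (n_actions, n_states)
        v = q.max(axis=0)                             # Bellman backup
    return v

def accelerated_value_iteration(P, r, gamma, beta=None, iters=500):
    # Momentum variant (illustrative): apply the Bellman operator at the
    # extrapolated point h_k = v_k + beta * (v_k - v_{k-1}). Convergence is
    # not guaranteed in general; it depends on the spectrum of the matrices.
    if beta is None:
        # A common heuristic momentum choice for gamma-contractions (assumed here).
        beta = (1 - np.sqrt(1 - gamma)) / (1 + np.sqrt(1 - gamma))
    n_states = P.shape[1]
    v_prev = np.zeros(n_states)
    v = np.zeros(n_states)
    for _ in range(iters):
        h = v + beta * (v - v_prev)                   # Nesterov-style extrapolation
        q = r + gamma * np.einsum('aij,j->ai', P, h)
        v_prev, v = v, q.max(axis=0)
    return v

# Small random test instance (assumed for illustration).
rng = np.random.default_rng(0)
n_actions, n_states = 3, 20
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)       # make each P[a] row-stochastic
r = rng.random((n_actions, n_states))
print(np.max(np.abs(value_iteration(P, r, 0.95) - accelerated_value_iteration(P, r, 0.95))))

On this toy instance both iterations approach the same fixed point; a poorly chosen beta, or unfavorable eigenvalues of the transition matrices, can make the momentum iteration diverge, which is precisely the kind of behavior the paper's spectral analysis addresses.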
Related works
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Interactive Value Iteration for Markov Decision Processes with Unknown Rewards
To tackle the potentially hard task of defining the reward function in a Markov Decision Process, we propose a new approach, based on Value Iteration, which interweaves the elicitation and optimization phases. We assume that rewards whose numeric values are unknown can only be ordered, and that a tutor is present to help comparing sequences of rewards. We first show how the set of possible rewa...
Fast Value Iteration for Goal-Directed Markov Decision Processes
Planning problems where effects of actions are non-deterministic can be modeled as Markov decision processes. Planning problems are usually goal-directed. This paper proposes several techniques for exploiting the goal-directedness to accelerate value iteration, a standard algorithm for solving Markov decision processes. Empirical studies have shown that the techniques can bring about signi...
Approximate Value Iteration for Risk-aware Markov Decision Processes
We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDPs paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling the risk measure, can be solved using dynamic programming for small to medium sized problems. However, due to the “curse of dimensionality”, MDPs that model real-life problem...
Topological Value Iteration Algorithm for Markov Decision Processes
Value Iteration is an inefficient algorithm for Markov decision processes (MDPs) because it puts the majority of its effort into backing up the entire state space, which turns out to be unnecessary in many cases. In order to overcome this problem, many approaches have been proposed. Among them, LAO*, LRTDP and HDP are state-of-the-art ones. All of these use reachability analysis and heuristics t...
Journal
Journal title: SIAM Journal on Matrix Analysis and Applications
Year: 2022
ISSN: 1095-7162, 0895-4798
DOI: https://doi.org/10.1137/20m1367192